llm therapist
Can You Share Your Story? Modeling Clients' Metacognition and Openness for LLM Therapist Evaluation
Kim, Minju, Yoo, Dongje, Hwang, Yeonjun, Kang, Minseok, Kim, Namyoung, Gwak, Minju, Kwak, Beong-woo, Chae, Hyungjoo, Kim, Harim, Lee, Yunjoong, Kim, Min Hee, Jung, Dayi, Chung, Kyong-Mee, Yeo, Jinyoung
Understanding clients' thoughts and beliefs is fundamental in counseling, yet current evaluations of LLM therapists often fail to assess this ability. Existing evaluation methods rely on client simulators that clearly disclose internal states to the therapist, making it difficult to determine whether an LLM therapist can uncover unexpressed perspectives. To address this limitation, we introduce MindVoyager, a novel evaluation framework featuring a controllable and realistic client simulator that dynamically adapts itself based on the ongoing counseling session, offering a more realistic and challenging evaluation environment. We further introduce evaluation metrics that assess the exploration ability of LLM therapists by measuring how thoroughly they understand clients' beliefs and thoughts.
Towards a Client-Centered Assessment of LLM Therapists by Client Simulation
Wang, Jiashuo, Xiao, Yang, Li, Yanran, Song, Changhe, Xu, Chunpu, Tan, Chenhao, Li, Wenjie
Although there is a growing belief that LLMs can be used as therapists, exploration of LLMs' capabilities and inefficacy, particularly from the client's perspective, remains limited. This work focuses on a client-centered assessment of LLM therapists with the involvement of simulated clients, a standard approach in clinical medical education. However, there are two challenges when applying the approach to assess LLM therapists at scale. Ethically, asking humans to frequently mimic clients and exposing them to potentially harmful LLM outputs can be risky and unsafe. Technically, it can be difficult to consistently compare the performance of different LLM therapists interacting with the same client. To this end, we adopt LLMs to simulate clients and propose ClientCAST, a client-centered approach to assessing LLM therapists by client simulation. Specifically, the simulated client interacts with LLM therapists and completes questionnaires related to the interaction. Based on the questionnaire results, we assess LLM therapists from three client-centered aspects: session outcome, therapeutic alliance, and self-reported feelings. We conduct experiments to examine the reliability of ClientCAST and use it to evaluate LLM therapists implemented with Claude-3, GPT-3.5, LLaMA3-70B, and Mixtral 8x7B. Code is released at https://github.com/wangjs9/ClientCAST.
- Europe > Ireland (0.04)
- North America > United States > Texas > Travis County > Austin (0.04)
- North America > United States > New Mexico > Bernalillo County > Albuquerque (0.04)
- Questionnaire & Opinion Survey (1.00)
- Personal > Interview (0.46)
- Research Report > New Finding (0.46)
A Computational Framework for Behavioral Assessment of LLM Therapists
Chiu, Yu Ying, Sharma, Ashish, Lin, Inna Wanyin, Althoff, Tim
The emergence of ChatGPT and other large language models (LLMs) has greatly increased interest in utilizing LLMs as therapists to support individuals struggling with mental health challenges. However, due to the lack of systematic studies, our understanding of how LLM therapists behave, i.e., the ways in which they respond to clients, is significantly limited. Understanding their behavior across a wide range of clients and situations is crucial to accurately assess their capabilities and limitations in the high-risk setting of mental health, where undesirable behaviors can lead to severe consequences. In this paper, we propose BOLT, a novel computational framework to study the conversational behavior of LLMs when employed as therapists. We develop an in-context learning method to quantitatively measure the behavior of LLMs based on 13 different psychotherapy techniques including reflections, questions, solutions, normalizing, and psychoeducation. Subsequently, we compare the behavior of LLM therapists against that of high- and low-quality human therapy, and study how their behavior can be modulated to better reflect behaviors observed in high-quality therapy. Our analysis of GPT and Llama variants reveals that these LLMs often resemble behaviors more commonly exhibited in low-quality therapy than in high-quality therapy, such as offering a higher degree of problem-solving advice when clients share emotions, which is against typical recommendations. At the same time, unlike low-quality therapy, LLMs reflect significantly more upon clients' needs and strengths. Our analysis framework suggests that despite the ability of LLMs to generate anecdotal examples that appear similar to human therapists, LLM therapists are currently not fully consistent with high-quality care, and thus require additional research to ensure quality care.
- North America > United States > Texas > Travis County > Austin (0.04)
- North America > United States > New Mexico > Bernalillo County > Albuquerque (0.04)
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.04)
- North America > Mexico (0.04)
- Personal > Interview (1.00)
- Research Report > Experimental Study (0.94)
- Research Report > New Finding (0.68)
- Health & Medicine > Therapeutic Area > Psychiatry/Psychology > Mental Health (1.00)
- Health & Medicine > Consumer Health (1.00)